Netron output (ONNX JSON) #3717

CharlieL7 · 2024-12-16T21:30:27Z

Adds a --netron option to the MIGX driver that outputs a JSON file in the ONNX format, readable by the Netron application.
~~Need to test out a couple different ways of handling base64 encoding~~
Probably want to make issues for each of the TODOs in the code.

pfultz2 · 2024-12-16T22:54:12Z

Add the base64 dependency to the requirements.txt file. You also need to add a find_path call for it in the CMake so the include path gets added.

src/netron_output.cpp

TedThemistokleous

Looks good. Just fix your tidy errors.

test/base64_test.cpp

CharlieL7 · 2024-12-18T22:33:42Z

I found a bug in the implementation: the bit math it used doesn't work with the alterations I made. ~~Going back to using a library instead since I have not found a decently compact snippet that passes CI.~~ Fixed the bug. I needed to ignore the clang warning about unsafe buffer access, since the method works directly on the bytes of the input string.
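For readers unfamiliar with that kind of suppression, here is a minimal sketch of what ignoring an unsafe-buffer warning around byte-level code can look like. The comment above does not name the warning or show how it was silenced, so the `-Wunsafe-buffer-usage` flag and the `byte_at` helper are assumptions for illustration, not the PR's code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

#ifdef __clang__
#pragma clang diagnostic push
// Older clang versions do not know the flag below; keep them quiet about that.
#pragma clang diagnostic ignored "-Wunknown-warning-option"
// Assumed warning name; the PR comment only says "unsafe buffer access".
#pragma clang diagnostic ignored "-Wunsafe-buffer-usage"
#endif

// Hypothetical helper: read byte i of a string as an unsigned value.
inline unsigned char byte_at(const std::string& s, std::size_t i)
{
    // Manual bounds check, since the raw pointer indexing below is unchecked.
    assert(i < s.size());
    const auto* bytes = reinterpret_cast<const unsigned char*>(s.data());
    return bytes[i];
}

#ifdef __clang__
#pragma clang diagnostic pop
#endif
```

The push/pop pair keeps the suppression local to the one function that indexes raw bytes, so the rest of the translation unit is still checked.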

src/base64.cpp

lakhinderwalia

Nit. Complicated arithmetic shifts.
(Approved).

src/netron_output.cpp

src/base64.cpp

pfultz2 · 2024-12-19T20:09:02Z

src/base64.cpp

+std::string b64_encode(const std::vector<byte>& buf)
+{
+    std::size_t len = buf.size();
+    std::vector<byte> res_vec((len + 2) / 3 * 4, '=');


The result should be stored as a std::string. There is no reason to prefill it with =; you can just push_back the characters.

push_back onto strings of very large lengths can be a performance issue. Ideally the space should just be reserved rather than filled with =.

reserve can be called if there is a perf issue. I don't think it's that important here.

Can't use std::string as is because the element type being unsigned char is important.

Could probably use basic_string<unsigned char>, but that's more code changes for what looks like a marginal benefit.
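To make the two points in this thread concrete, here is a minimal sketch, assuming a plain std::string is used for both input and output. It shows why the byte type matters (a signed char would sign-extend when widened before shifting) and how reserve allocates the output space without filling it. The names `to_int`, `pack_triplet`, `b64_encoded_size` and `make_output_buffer` are hypothetical and are not taken from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: widen one char to std::size_t without sign extension.
// Going through unsigned char first is what makes 0x80..0xFF come out right.
inline std::size_t to_int(char c)
{
    return static_cast<std::size_t>(static_cast<unsigned char>(c));
}

// Hypothetical helper: pack the three bytes starting at pos into 24 bits.
inline std::size_t pack_triplet(const std::string& s, std::size_t pos)
{
    assert(pos + 3 <= s.size());
    return to_int(s[pos]) << 16u | to_int(s[pos + 1]) << 8u | to_int(s[pos + 2]);
}

// Padded base64 length for len input bytes, same formula as the diff above.
inline std::size_t b64_encoded_size(std::size_t len) { return (len + 2) / 3 * 4; }

// reserve() only allocates capacity; the string stays empty, so the encoder
// can push_back characters instead of overwriting a '=' prefill.
inline std::string make_output_buffer(std::size_t input_len)
{
    std::string result;
    result.reserve(b64_encoded_size(input_len));
    return result;
}
```

With a conversion helper like this, the input could stay a std::string and no basic_string<unsigned char> would be needed; whether that is worth the churn is the judgment call discussed above.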

pfultz2 · 2024-12-19T20:15:08Z

src/base64.cpp

+    std::size_t pad_cond = len % 3;
+    const size_t last    = len - pad_cond;
+
+    for(size_t i = 0; i < last; i += 3)


I would use the iterators directly for the loop:

    for(auto it = buf.begin(); it < buf.end(); it += 3)
    {
        std::size_t n = to_int(it[0]) << 16u | to_int(it[1]) << 8u | to_int(it[2]);
        ...
    }

And add the to_int function to bit_cast it and convert it to std::size_t.

Actually, it would be better to make a function to encode the triplet, since you repeat the same below for the "padding":

    template <class Input>
    std::array<char, 4> encode(Input input)
    {
        std::size_t n = to_int(input[0]) << 16u | to_int(input[1]) << 8u | to_int(input[2]);
        return {b64_chars.at(n >> 18u),
                b64_chars.at(n >> 12u & 0x3Fu),
                b64_chars.at(n >> 6u & 0x3Fu),
                b64_chars.at(n & 0x3Fu)};
    }

Then the loop can do:

    for(auto it = buf.begin(); it < buf.end(); it += 3)
    {
        copy(encode(it), std::back_inserter(result));
    }

As far as iterator usage is concerned, lt or gt comparison against buf.end() is not recommended. Only an equality comparison should be used: == or !=.

> As far as iterator usage is concerned, lt or gt comparison is not recommended against buf.end()

Not recommended by whom? These are random access iterators, so they support comparison operators just like pointers. The < operator should definitely be used here since we are stepping in increments of 3, which means the iterator could skip past the end. With != that would become an infinite loop, since we would never reach the end once we have passed it.

But it could be UB if it's past buf.end(), so we probably need to do buf.end() - remaining to avoid that.
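The bound suggested in the last comment can be sketched as follows. This is an illustration only, assuming a std::vector<unsigned char> input; `pack_full_triplets` is a hypothetical name and the function just collects the packed 24-bit groups rather than encoding them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: pack every complete 3-byte group of buf into a 24-bit
// value, leaving the 0, 1 or 2 trailing bytes untouched.
inline std::vector<std::size_t> pack_full_triplets(const std::vector<unsigned char>& buf)
{
    const std::size_t remaining = buf.size() % 3;
    // Pull the end back by the leftover bytes so the distance from begin() is
    // a multiple of 3; `it += 3` then lands exactly on `last` and is never
    // advanced beyond buf.end().
    const auto last = buf.end() - static_cast<std::ptrdiff_t>(remaining);
    std::vector<std::size_t> groups;
    groups.reserve(buf.size() / 3);
    for(auto it = buf.begin(); it < last; it += 3)
    {
        groups.push_back(static_cast<std::size_t>(it[0]) << 16u |
                         static_cast<std::size_t>(it[1]) << 8u |
                         static_cast<std::size_t>(it[2]));
    }
    assert(groups.size() == buf.size() / 3);
    return groups;
}
```

With the adjusted bound, `<` and `!=` behave the same, which sidesteps the disagreement above; the leftover bytes are handled separately as padding.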

pfultz2 · 2024-12-19T20:40:52Z

src/base64.cpp

+        res_vec.at(j++) = b64_chars.at(pad_cond != 0 ? n >> 10u & 0x3Fu : n >> 2u);
+        res_vec.at(j++) = b64_chars.at(pad_cond != 0 ? n >> 4u & 0x03Fu : n << 4u & 0x3Fu);
+        res_vec.at(j++) = pad_cond != 0 ? b64_chars.at(n << 2u & 0x3Fu) : '=';
+    }


You can reuse the encode function so it doesn't repeat the code. You can also copy the chars into an array (this way pad_cond is not getting subtly modified):

    assert(pad_cond < 3);
    std::array<char, 3> triple = {0};
    // Get the remaining characters to encode
    std::copy(buf.end() - pad_cond, buf.end(), triple.begin());
    auto e = encode(triple);
    // Add the encoded characters
    std::copy(e.begin(), e.begin() + 1 + pad_cond, std::back_inserter(result));
    // Pad string with `=`
    result.append(3 - pad_cond, '=');
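Putting the review suggestions together, a complete encoder along these lines might look like the sketch below. It is not the code that was merged; the `sketch` namespace, `encode_triplet`, and the explicit `pad_cond != 0` guard are assumptions added so the example stands on its own.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

const std::string b64_chars =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Encode exactly three bytes, read through a random access iterator, into
// four base64 characters.
template <class Iterator>
std::array<char, 4> encode_triplet(Iterator it)
{
    const std::size_t n = static_cast<std::size_t>(it[0]) << 16u |
                          static_cast<std::size_t>(it[1]) << 8u |
                          static_cast<std::size_t>(it[2]);
    return {{b64_chars.at(n >> 18u),
             b64_chars.at(n >> 12u & 0x3Fu),
             b64_chars.at(n >> 6u & 0x3Fu),
             b64_chars.at(n & 0x3Fu)}};
}

std::string b64_encode(const std::vector<unsigned char>& buf)
{
    const std::size_t pad_cond = buf.size() % 3;
    assert(pad_cond < 3);
    // End of the last complete triplet, so the loop never steps past end().
    const auto last = buf.end() - static_cast<std::ptrdiff_t>(pad_cond);
    std::string result;
    result.reserve((buf.size() + 2) / 3 * 4);
    for(auto it = buf.begin(); it < last; it += 3)
    {
        const auto e = encode_triplet(it);
        result.append(e.begin(), e.end());
    }
    if(pad_cond != 0)
    {
        // Zero-fill a triplet, copy in the 1 or 2 leftover bytes, encode it,
        // keep only the characters that carry data, then pad with '='.
        std::array<unsigned char, 3> triple = {{0, 0, 0}};
        std::copy(last, buf.end(), triple.begin());
        const auto e = encode_triplet(triple.begin());
        result.append(e.begin(), e.begin() + 1 + pad_cond);
        result.append(3 - pad_cond, '=');
    }
    return result;
}

} // namespace sketch
```

For example, `sketch::b64_encode({'f', 'o'})` yields `Zm8=`, matching the RFC 4648 test vector. The guard on `pad_cond` matters: without it, an input whose length is a multiple of 3 would get one stray character and three `=` appended.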

CharlieL7 · 2024-12-19T21:31:40Z

I'm not inclined to rewrite the base64 encode code for a 5th time, so I'm going to leave it as is for now. We can improve on it later.

src/include/migraphx/base64.hpp

test/base64_test.cpp

pfultz2 · 2024-12-19T22:08:30Z

test/base64_test.cpp

+    std::string expected{"AAAA"};
+    std::string actual{migraphx::b64_encode({input.begin(), input.end()})};
+    EXPECT(expected == actual);
+}


Also, the variables should be inlined: EXPECT("AAAA" == migraphx::b64_encode({'\x00', '\x00', '\x00'}));

src/netron_output.cpp

CharlieL7 added 12 commits April 30, 2024 20:04

initial

fc41a6f

Merge branch 'develop' of github.com:ROCmSoftwarePlatform/AMDMIGraphX…

7f79ce2

… into onnx_json_output

something

0796820

Merge branch 'develop' of github.com:ROCmSoftwarePlatform/AMDMIGraphX…

0c01550

… into onnx_json_output

Merge branch 'develop' of github.com:ROCmSoftwarePlatform/AMDMIGraphX…

b49a762

… into onnx_json_output

Merge branch 'develop' of github.com:ROCmSoftwarePlatform/AMDMIGraphX…

0dd2876

… into onnx_json_output

more progress

74870fb

first draft

25e76a0

First functional

e7a40ab

Merge branch 'develop' of github.com:ROCmSoftwarePlatform/AMDMIGraphX…

0e16a1a

… into onnx_json_output

handle attributes

a375111

Merge branch 'develop' of github.com:ROCmSoftwarePlatform/AMDMIGraphX…

fe76f7e

… into onnx_json_output

CharlieL7 self-assigned this Dec 16, 2024

CharlieL7 requested a review from causten as a code owner December 16, 2024 21:30

CharlieL7 marked this pull request as draft December 16, 2024 21:30

pfultz2 reviewed Dec 16, 2024

src/netron_output.cpp

pfultz2 reviewed Dec 16, 2024

src/netron_output.cpp (outdated)

pfultz2 reviewed Dec 16, 2024

src/netron_output.cpp (outdated)

PR updates, base64 encode only and tests

552638b

CharlieL7 marked this pull request as ready for review December 18, 2024 21:26

Merge branch 'develop' into onnx_json_output

1d89e41

CharlieL7 requested review from lakhinderwalia, TedThemistokleous and pfultz2 December 18, 2024 21:26

TedThemistokleous approved these changes Dec 18, 2024

TedThemistokleous added the enhancement (New feature or request) and roadmap (Tasks to finish for a release) labels Dec 18, 2024

lakhinderwalia reviewed Dec 18, 2024

test/base64_test.cpp

Merge branch 'onnx_json_output' of github.com:ROCmSoftwarePlatform/AM…

b3f99eb

…DMIGraphX into onnx_json_output

CharlieL7 requested a review from lakhinderwalia December 18, 2024 23:30

lakhinderwalia reviewed Dec 19, 2024

src/base64.cpp (outdated)

lakhinderwalia reviewed Dec 19, 2024

src/base64.cpp (outdated)

fix tidy and another base64 refactor

f89dfd5

CharlieL7 requested a review from lakhinderwalia December 19, 2024 15:35

CharlieL7 added 3 commits December 19, 2024 09:37

Licensing

965644a

Merge branch 'develop' of github.com:ROCmSoftwarePlatform/AMDMIGraphX…

937836f

… into onnx_json_output

more tidy fixes

cf9271b

lakhinderwalia reviewed Dec 19, 2024

src/base64.cpp (outdated)

src/base64.cpp (outdated)

lakhinderwalia approved these changes Dec 19, 2024

src/netron_output.cpp

CharlieL7 added 2 commits December 19, 2024 11:40

Codecov ignore, tidy fixes

a195d07

even more tidy fixes

a758744